DNN based Speaker Recognition on Short Utterances

Authors

  • Ahilan Kanagasundaram
  • David Dean
  • Sridha Sridharan
  • Clinton Fookes
Abstract

This paper investigates the effects of limited speech data on speaker verification using a deep neural network (DNN) approach. Reducing the length of speech data required is important for deploying speaker verification systems in real-world applications. The experimental studies found that a DNN-senone-based Gaussian probabilistic linear discriminant analysis (GPLDA) system achieves improvements in equal error rate (EER) of above 50% and 18% over a GMM-UBM GPLDA system on the NIST 2010 coreext-coreext and truncated 15sec-15sec evaluation conditions, respectively. Further, when the GPLDA model is trained on short utterances (30sec) rather than full-length utterances (2min), the DNN-senone GPLDA system achieves above 7% improvement in EER on the truncated 15sec-15sec condition. This is because short-length development i-vectors carry speaker, session and phonetic variation, and GPLDA is able to model those variations robustly. In several real-world applications, longer utterances (2min) can be used for enrollment while shorter utterances (15sec) are required for verification; in those conditions, the DNN-senone GPLDA system achieves above 26% improvement in EER over GMM-UBM GPLDA systems.
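
The EER figures quoted in the abstract can be illustrated with a minimal sketch. It assumes that verification scores are available as two arrays (target and non-target trials) and that the quoted percentages are relative EER reductions, which the abstract does not state explicitly. The function names `equal_error_rate` and `relative_improvement` are hypothetical and are not taken from the paper.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction) by sweeping over observed scores.

    Assumption: a higher score means the trial is more likely a target trial.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # False rejection rate: target trials scoring below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: non-target trials scoring at or above it.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction in percent (assumed meaning of 'improvement')."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


if __name__ == "__main__":
    # Toy scores for illustration only; these are not results from the paper.
    eer = equal_error_rate([2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 2.5, 1.5])
    print(f"EER: {100 * eer:.1f}%")
    print(f"Relative improvement: {relative_improvement(4.0, 2.0):.1f}%")
```

Under this reading, a baseline EER of 4% that drops to 2% would count as a 50% improvement. Evaluation toolkits usually interpolate between thresholds rather than picking the closest observed one, so this sweep is only an approximation.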

Similar resources

DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances

We investigate how to improve the performance of DNN i-vector based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. Text-constrained verification, due to its smaller, limited vocabulary, can deliver better performance than text-independent verification for a short utterance. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...

Text-Available Speaker Recognition System for Forensic Applications

This paper examines a text-available speaker recognition approach targeting scenarios where the transcripts of test utterances are either available or obtainable through manual transcription. Forensic speaker recognition is one such application where human supervision can be expected. In our study, we extend an existing Deep Neural Network (DNN) i-vector-based speaker recognition system ...

Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features

Speaker verification in real-world applications sometimes deals with limited duration of enrollment and/or test data. MFCC-based i-vector systems have defined the state-of-the-art for speaker verification, but it is well known that they are less effective with short utterances. To address this issue, we propose a method to leverage the speaker specificity and stationarity of subglottal acoustic...

Content Normalization for Text-independent Speaker Verification

In the past few years, Deep Neural Network (DNN) based i-vector Speaker Verification (SV) systems have been shown to provide state-of-the-art performance. However, error rates increase drastically for short duration recordings. In this paper, we improve the i-vector approach for short utterances (i) by using smoothed DNN posteriors for i-vector extraction, and (ii) by normalizing the content of the ...

End-to-end DNN Based Speaker Recognition Inspired by i-vector and PLDA

Recently several end-to-end speaker verification systems based on deep neural networks (DNNs) have been proposed. These systems have been proven to be competitive for text-dependent tasks as well as for text-independent tasks with short utterances. However, for text-independent tasks with longer utterances, end-to-end systems are still outperformed by standard i-vector + PLDA systems. In this w...

Journal:
  • CoRR

Volume: abs/1610.03190

Publication date: 2016